Measuring Word Alignment Quality for Statistical Machine Translation

Authors

  • Alexander M. Fraser
  • Daniel Marcu
Abstract

Automatic word alignment plays a critical role in statistical machine translation. Unfortunately, the relationship between alignment quality and statistical machine translation performance has not been well understood. In the recent literature, the alignment task has frequently been decoupled from the translation task, and assumptions have been made about measuring alignment quality for machine translation which, it turns out, are not justified. In particular, none of the tens of papers published over the last five years has shown that significant decreases in Alignment Error Rate (AER; Och and Ney, 2003) result in significant increases in translation quality. This paper explains this state of affairs and presents steps towards measuring alignment quality in a way which is predictive of statistical machine translation quality.
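
For reference, AER as defined by Och and Ney (2003) scores a hypothesized alignment A against gold-standard "sure" links S and "possible" links P (with S a subset of P): AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The following is a minimal Python sketch of that definition. The function name, the representation of links as (source index, target index) pairs, and the toy data are illustrative assumptions and do not come from the paper.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Compute AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|).

    Each argument is an iterable of (source_index, target_index) links.
    This representation is an assumption made for illustration.
    """
    a = set(hypothesis)
    s = set(sure)
    # Sure links are treated as possible links too, so that S is a subset of P.
    p = set(possible) | s
    denominator = len(a) + len(s)
    if denominator == 0:
        raise ValueError("AER is undefined when both A and S are empty")
    return 1.0 - (len(a & s) + len(a & p)) / denominator


# Toy example (hypothetical data, not from the paper).
sure_links = {(0, 0), (1, 1)}
possible_links = {(0, 0), (1, 1), (2, 1)}
hypothesis_links = {(0, 0), (2, 1), (2, 2)}

aer = alignment_error_rate(hypothesis_links, sure_links, possible_links)
print(f"AER = {aer:.3f}")
```

Lower values indicate closer agreement with the gold standard, and a hypothesis identical to the sure links scores 0. The abstract's point is that a lower AER alone has not been shown to yield better translation quality.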

Similar resources

Squibs and Discussions: Measuring Word Alignment Quality for Statistical Machine Translation

Automatic word alignment plays a critical role in statistical machine translation. Unfortunately, the relationship between alignment quality and statistical machine translation performance has not been well understood. In the recent literature, the alignment task has frequently been decoupled from the translation task and assumptions have been made about measuring alignment quality for machine ...

All Links are not the Same: Evaluating Word Alignments for Statistical Machine Translation

Word alignments, the mappings between source and target language words for two languages, are a critical component of statistical machine translation. A long-standing issue in statistical machine translation is that the quality of word alignments does not correlate as well as would be expected with measures of translation quality. A number of recent papers have shed light on this issue by impro...

Enhancing Statistical Machine Translation with Character Alignment

The dominant practice in statistical machine translation (SMT) uses the same Chinese word segmentation specification in both the alignment and translation rule induction steps when building a Chinese-English SMT system. This may be suboptimal, because the word segmentation that is better for alignment is not necessarily better for translation. To tackle this, we propose a framework that uses two di...

Semi-Supervised Training for Statistical Word Alignment

We introduce a semi-supervised approach to training for statistical machine translation that alternates the traditional Expectation Maximization step that is applied on a large training corpus with a discriminative step aimed at increasing word-alignment quality on a small, manually word-aligned sub-corpus. We show that our algorithm leads not only to improved alignments but also to machine tra...

Improving Function Word Alignment with Frequency and Syntactic Information

In statistical word alignment for machine translation, function words usually cause poor alignment performance because they do not have clear correspondence between different languages. This paper proposes a novel approach to improve word alignment by pruning alignments of function words from an existing alignment model with high precision and recall. Based on monolingual and bilingual frequency...

Journal:
  • Computational Linguistics

Volume 33

Publication date: 2007